Explainable Boosting Machine#
See the reference paper [1] for full details.
Summary#
Explainable Boosting Machine (EBM) is a tree-based, cyclic gradient boosting Generalized Additive Model with automatic interaction detection. EBMs are often as accurate as state-of-the-art blackbox models while remaining completely interpretable. Although EBMs are often slower to train than other modern algorithms, EBMs are extremely compact and fast at prediction time.
How it Works#
As part of the framework, InterpretML also includes a new interpretability algorithm – the Explainable Boosting Machine (EBM). EBM is a glassbox model, designed to have accuracy comparable to state-of-the-art machine learning methods like Random Forest and Boosted Trees, while being highly intelligible and explainable. EBM is a generalized additive model (GAM) of the form:

\[g(E[y]) = \beta_0 + \sum_j f_j(x_j)\]

where \(g\) is the link function that adapts the GAM to different settings such as regression or classification.
EBM has a few major improvements over traditional GAMs [2]. First, EBM learns each feature function \(f_j\) using modern machine learning techniques such as bagging and gradient boosting. The boosting procedure is carefully restricted to train on one feature at a time, in round-robin fashion, using a very low learning rate so that feature order does not matter. Cycling through the features in this way mitigates the effects of collinearity and learns the best feature function \(f_j\) for each feature, showing how each feature contributes to the model’s prediction. Second, EBM can automatically detect and include pairwise interaction terms of the form:

\[g(E[y]) = \beta_0 + \sum_j f_j(x_j) + \sum_{(i,j)} f_{ij}(x_i, x_j)\]
which further increases accuracy while maintaining intelligibility. EBM is a fast implementation of the GA2M algorithm [1], written in C++ and Python. The implementation is parallelizable, and takes advantage of joblib to provide multi-core and multi-machine parallelization. The algorithmic details for the training procedure, selection of pairwise interaction terms, and case studies can be found in [1, 3, 4].
EBMs are highly intelligible because the contribution of each feature to a final prediction can be visualized and understood by plotting \(f_j\). Because EBM is an additive model, each feature contributes to predictions in a modular way, making it easy to reason about each feature’s contribution to the prediction.
To make individual predictions, each function \(f_j\) acts as a lookup table per feature, and returns a term contribution. These term contributions are simply added up, and passed through the link function \(g\) to compute the final prediction. Because of the modularity (additivity), term contributions can be sorted and visualized to show which features had the most impact on any individual prediction.
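The lookup-and-add prediction scheme can be illustrated with a small, self-contained sketch. The feature names, bins, and scores below are invented for illustration; a real EBM learns its feature functions during boosting and stores them as arrays indexed by bin:

```python
import math

# Hypothetical learned feature functions: each maps a bin to a term contribution.
# A real EBM stores these as arrays indexed by bin; dicts keep the sketch readable.
f_age = {"<30": -0.8, "30-50": 0.2, ">50": 0.5}     # contribution by age bin
f_hours = {"part_time": -0.4, "full_time": 0.3}     # contribution by hours bin
intercept = -1.0

def predict_proba(age_bin, hours_bin):
    # Sum the term contributions (additivity) ...
    score = intercept + f_age[age_bin] + f_hours[hours_bin]
    # ... then pass through the inverse link (logistic for classification).
    return 1.0 / (1.0 + math.exp(-score))

# Because the model is additive, the per-term contributions can be sorted by
# magnitude to show which features drove an individual prediction.
contribs = {"Age": f_age["30-50"], "HoursPerWeek": f_hours["full_time"]}
ranked = sorted(contribs.items(), key=lambda kv: abs(kv[1]), reverse=True)

print(predict_proba("30-50", "full_time"))  # probability in (0, 1)
print(ranked)
```

This is exactly why prediction is cheap: two dictionary lookups, two additions, and one logistic transform, no tree traversal at inference time.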
To keep the individual terms additive, EBM pays an additional training cost, making it somewhat slower than similar methods. However, because making predictions involves simple additions and lookups inside of the feature functions \(f_j\), EBMs are one of the fastest models to execute at prediction time. EBM’s light memory usage and fast predict times makes it particularly attractive for model deployment in production.
If you prefer video as a medium for learning, a conceptual overview of the algorithm is available below:

Code Example#
The following code trains an EBM classifier on the adult income dataset and produces visualizations for both global and local explanations.
from interpret import set_visualize_provider
from interpret.provider import InlineProvider
set_visualize_provider(InlineProvider())
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score
from interpret.glassbox import ExplainableBoostingClassifier
from interpret import show
df = pd.read_csv(
"https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data",
header=None)
df.columns = [
"Age", "WorkClass", "fnlwgt", "Education", "EducationNum",
"MaritalStatus", "Occupation", "Relationship", "Race", "Gender",
"CapitalGain", "CapitalLoss", "HoursPerWeek", "NativeCountry", "Income"
]
X = df.iloc[:, :-1]
y = df.iloc[:, -1]
seed = 42
np.random.seed(seed)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=seed)
ebm = ExplainableBoostingClassifier()
ebm.fit(X_train, y_train)
auc = roc_auc_score(y_test, ebm.predict_proba(X_test)[:, 1])
print("AUC: {:.3f}".format(auc))
AUC: 0.929
show(ebm.explain_global())
show(ebm.explain_local(X_test[:5], y_test[:5]), 0)
Further Resources#
Bibliography#
[1] Yin Lou, Rich Caruana, Johannes Gehrke, and Giles Hooker. Accurate intelligible models with pairwise interactions. In Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining, 623–631. 2013. Paper Link
[2] Trevor Hastie and Robert Tibshirani. Generalized additive models: some applications. Journal of the American Statistical Association, 82(398):371–386, 1987.
[3] Yin Lou, Rich Caruana, and Johannes Gehrke. Intelligible models for classification and regression. In Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining, 150–158. 2012. Paper Link
[4] Rich Caruana, Yin Lou, Johannes Gehrke, Paul Koch, Marc Sturm, and Noemie Elhadad. Intelligible models for healthcare: predicting pneumonia risk and hospital 30-day readmission. In Proceedings of the 21th ACM SIGKDD international conference on knowledge discovery and data mining, 1721–1730. 2015. Paper Link
API#
ExplainableBoostingClassifier#
- class interpret.glassbox.ExplainableBoostingClassifier(feature_names=None, feature_types=None, max_bins=256, max_interaction_bins=32, interactions=10, exclude=[], validation_size=0.15, outer_bags=8, inner_bags=0, learning_rate=0.01, greediness=0.0, smoothing_rounds=0, max_rounds=5000, early_stopping_rounds=50, early_stopping_tolerance=0.0001, min_samples_leaf=2, max_leaves=3, objective='log_loss', n_jobs=-2, random_state=42)#
An Explainable Boosting Classifier
- Parameters:
feature_names (list of str, default=None) – List of feature names.
feature_types (list of FeatureType, default=None) –
List of feature types. FeatureType can be:
None: Auto-detect
’quantile’: Continuous with equal density bins
’rounded_quantile’: Continuous with quantile bins, but the cut values are rounded when possible
’uniform’: Continuous with equal width bins
’winsorized’: Continuous with equal width bins, but the leftmost and rightmost cut are chosen by quantiles
’continuous’: Use the default binning for continuous features, which is ‘quantile’ currently
[List of float]: Continuous with specified cut values. Eg: [5.5, 8.75]
[List of str]: Ordinal categorical where the order has meaning. Eg: [“low”, “medium”, “high”]
’ordinal’: Ordinal categorical where the order is determined by sorting the feature strings
’nominal’: Categorical where the order has no meaning. Eg: country names
max_bins (int, default=256) – Max number of bins per feature for the main effects stage.
max_interaction_bins (int, default=32) – Max number of bins per feature for interaction terms.
interactions (int, float, or list of tuples of feature indices, default=10) –
Interaction terms to be included in the model. Options are:
Integer (1 <= interactions): Count of interactions to be automatically selected
Percentage (interactions < 1.0): Determine the integer count of interactions by multiplying the number of features by this percentage
List of tuples: The tuples contain the indices of the features within the additive term
exclude ('mains' or list of tuples of feature indices|names, default=[]) – Features or terms to be excluded.
validation_size (int or float, default=0.15) –
Validation set size. Used for early stopping during boosting, and is needed to create outer bags.
Integer (1 <= validation_size): Count of samples to put in the validation sets
Percentage (validation_size < 1.0): Percentage of the data to put in the validation sets
0: Turns off early stopping. Outer bags have no utility. Error bounds will be eliminated
outer_bags (int, default=8) – Number of outer bags. Outer bags are used to generate error bounds and help with smoothing the graphs.
inner_bags (int, default=0) – Number of inner bags. 0 turns off inner bagging.
learning_rate (float, default=0.01) – Learning rate for boosting.
greediness (float, default=0.0) – Percentage of rounds where boosting is greedy instead of round-robin. Greedy rounds are intermixed with cyclic rounds.
smoothing_rounds (int, default=0) – Number of initial highly regularized rounds to set the basic shape of the main effect feature graphs.
max_rounds (int, default=5000) – Total number of boosting rounds with n_terms boosting steps per round.
early_stopping_rounds (int, default=50) – Number of rounds with no improvement to trigger early stopping. 0 turns off early stopping and boosting will occur for exactly max_rounds.
early_stopping_tolerance (float, default=1e-4) – Tolerance that dictates the smallest delta required to be considered an improvement.
min_samples_leaf (int, default=2) – Minimum number of samples allowed in the leaves.
max_leaves (int, default=3) – Maximum number of leaves allowed in each tree.
objective (str, default="log_loss") – The objective to optimize.
n_jobs (int, default=-2) – Number of jobs to run in parallel. Negative integers are interpreted as following joblib’s formula (n_cpus + 1 + n_jobs), just like scikit-learn. Eg: -2 means using all threads except 1.
random_state (int or None, default=42) – Random state. None uses device_random and generates non-repeatable sequences.
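The count-vs-percentage conventions shared by interactions and validation_size can be paraphrased in a few lines. The helper below is hypothetical, not part of the interpret API, and the library's exact rounding may differ:

```python
def resolve_count(value, total):
    """Resolve an int-or-fraction option (hypothetical helper, not interpret API).

    Values >= 1 are taken as literal counts; values below 1.0 are a fraction
    of `total`; 0 disables the option entirely.
    """
    if value == 0:
        return 0                   # e.g. validation_size=0 turns off early stopping
    if value < 1.0:
        return int(total * value)  # percentage of features / samples
    return int(value)              # literal count

# interactions=10 -> 10 pairs selected automatically, regardless of feature count
print(resolve_count(10, 14))
# interactions=0.5 with 14 features -> 7 pairs
print(resolve_count(0.5, 14))
# validation_size=0.15 with 32561 samples -> 4884 samples held out per bag
print(resolve_count(0.15, 32561))
```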
- Variables:
classes_ (array of bool, int, or unicode with shape (n_classes,)) – The class labels.
n_features_in_ (int) – Number of features.
feature_names_in_ (List of str) – Resolved feature names. Names can come from feature_names, X, or be auto-generated.
feature_types_in_ (List of str) – Resolved feature types. Can be: ‘continuous’, ‘nominal’, or ‘ordinal’.
bins_ (List[Union[List[Dict[str, int]], List[array of float with shape (n_cuts,)]]]) – Per-feature list that defines how to bin each feature. Each feature in the list contains a list of binning resolutions. The first item in the binning resolution list is for binning main effect features. If there are more items in the binning resolution list, they define the binning for successive levels of resolutions. The item at index 1, if it exists, defines the binning for pairs. The last binning resolution defines the bins for all successive interaction levels. If the binning resolution list contains dictionaries, then the feature is either a ‘nominal’ or ‘ordinal’ categorical. If the binning resolution list contains arrays, then the feature is ‘continuous’ and the arrays will contain float cut points that separate continuous values into bins.
feature_bounds_ (array of float with shape (n_features, 2)) – Min/max bounds for each feature. feature_bounds_[feature_index, 0] is the min value of the feature and feature_bounds_[feature_index, 1] is the max value of the feature. Categoricals have min & max values of NaN.
histogram_edges_ (List of None or array of float with shape (n_hist_edges,)) – Per-feature list of the histogram edges. Categorical features contain None within the List at their feature index.
histogram_weights_ (List of array of float with shape (n_hist_bins,)) – Per-feature list of the total sample weights within each feature’s histogram bins.
unique_val_counts_ (array of int with shape (n_features,)) – Per-feature count of the number of unique feature values.
term_features_ (List of tuples of feature indices) – Additive terms used in the model and their component feature indices.
term_names_ (List of str) – List of term names.
bin_weights_ (List of array of float with shape (n_feature0_bins, ..., n_featureN_bins)) – Per-term list of the total sample weights in each term’s tensor bins.
bagged_scores_ (List of array of float with shape (n_outer_bags, n_feature0_bins, ..., n_featureN_bins, n_classes) or (n_outer_bags, n_feature0_bins, ..., n_featureN_bins)) – Per-term list of the bagged model scores. The last dimension of length n_classes is dropped for binary classification.
term_scores_ (List of array of float with shape (n_feature0_bins, ..., n_featureN_bins, n_classes) or (n_feature0_bins, ..., n_featureN_bins)) – Per-term list of the model scores. The last dimension of length n_classes is dropped for binary classification.
standard_deviations_ (List of array of float with shape (n_feature0_bins, ..., n_featureN_bins, n_classes) or (n_feature0_bins, ..., n_featureN_bins)) – Per-term list of the standard deviations of the bagged model scores. The last dimension of length n_classes is dropped for binary classification.
bag_weights_ (array of float with shape (n_outer_bags,)) – Per-bag record of the total weight within each bag.
breakpoint_iteration_ (array of int with shape (n_stages, n_outer_bags)) – The number of boosting rounds performed within each stage until either early stopping, or the max_rounds was reached. Normally, the count of main effects boosting rounds will be in breakpoint_iteration_[0], and the count of interaction boosting rounds will be in breakpoint_iteration_[1].
intercept_ (array of float with shape (n_classes,) or (1,)) – Intercept of the model. Binary classification is shape (1,), and multiclass is shape (n_classes,).
- decision_function(X, init_score=None)#
Predict scores from model before calling the link function.
- Parameters:
X – Numpy array for samples.
init_score – Optional. Either a model that can generate scores, or per-sample initialization scores. If per-sample scores, they should be the same length as X.
- Returns:
The sum of the additive term contributions.
- explain_global(name=None)#
Provides global explanation for model.
- Parameters:
name – User-defined explanation name.
- Returns:
An explanation object, visualizing feature-value pairs as a horizontal bar chart.
- explain_local(X, y=None, name=None, init_score=None)#
Provides local explanations for provided samples.
- Parameters:
X – Numpy array for X to explain.
y – Numpy vector for y to explain.
name – User-defined explanation name.
init_score – Optional. Either a model that can generate scores, or per-sample initialization scores. If per-sample scores, they should be the same length as X.
- Returns:
An explanation object, visualizing feature-value pairs for each sample as horizontal bar charts.
- fit(X, y, sample_weight=None, init_score=None)#
Fits model to provided samples.
- Parameters:
X – Numpy array for training samples.
y – Numpy array as training labels.
sample_weight – Optional array of weights per sample. Should be same length as X and y.
init_score – Optional. Either a model that can generate scores, or per-sample initialization scores. If per-sample scores, they should be the same length as X.
- Returns:
Itself.
- monotonize(term, increasing='auto')#
Adjusts a term to be monotone using isotonic regression.
- Parameters:
term – Index or name of the continuous univariate term to apply monotone constraints to.
increasing – ‘auto’ or bool. ‘auto’ decides direction based on Spearman correlation estimate.
- Returns:
Itself.
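The isotonic regression behind monotonize is usually computed with the pool adjacent violators algorithm. The sketch below is a simplified, unweighted version for illustration; the library additionally weights bins by their sample counts:

```python
def pool_adjacent_violators(scores):
    """Make a sequence non-decreasing by averaging adjacent violating blocks.

    Simplified, unweighted sketch of isotonic regression; monotonize() itself
    also accounts for per-bin sample weights.
    """
    blocks = [[s, 1] for s in scores]  # each block: [block mean, block size]
    i = 0
    while i < len(blocks) - 1:
        if blocks[i][0] > blocks[i + 1][0]:  # violation: pool the two blocks
            merged_size = blocks[i][1] + blocks[i + 1][1]
            merged_mean = (blocks[i][0] * blocks[i][1]
                           + blocks[i + 1][0] * blocks[i + 1][1]) / merged_size
            blocks[i] = [merged_mean, merged_size]
            del blocks[i + 1]
            i = max(i - 1, 0)  # the merged block may now violate on its left
        else:
            i += 1
    out = []
    for mean_val, size in blocks:
        out.extend([mean_val] * size)
    return out

# A bumpy term shape becomes monotone non-decreasing:
print(pool_adjacent_violators([0.1, 0.5, 0.3, 0.7]))
```

With increasing=False the same idea applies in the opposite direction (or to the negated scores).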
- predict(X, init_score=None)#
Predicts on provided samples.
- Parameters:
X – Numpy array for samples.
init_score – Optional. Either a model that can generate scores, or per-sample initialization scores. If per-sample scores, they should be the same length as X.
- Returns:
Predicted class label per sample.
- predict_and_contrib(X, output='probabilities', init_score=None)#
Predicts on provided samples, returning predictions and explanations for each sample.
- Parameters:
X – Numpy array for samples.
output – Prediction type to output (i.e. one of ‘probabilities’, ‘labels’, ‘logits’)
init_score – Optional. Either a model that can generate scores, or per-sample initialization scores. If per-sample scores, they should be the same length as X.
- Returns:
Predictions and local explanations for each sample.
- predict_proba(X, init_score=None)#
Probability estimates on provided samples.
- Parameters:
X – Numpy array for samples.
init_score – Optional. Either a model that can generate scores, or per-sample initialization scores. If per-sample scores, they should be the same length as X.
- Returns:
Probability estimate of sample for each class.
- score(X, y, sample_weight=None)#
Return the mean accuracy on the given test data and labels.
In multi-label classification, this is the subset accuracy which is a harsh metric since you require for each sample that each label set be correctly predicted.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Test samples.
y (array-like of shape (n_samples,) or (n_samples, n_outputs)) – True labels for X.
sample_weight (array-like of shape (n_samples,), default=None) – Sample weights.
- Returns:
score – Mean accuracy of self.predict(X) w.r.t. y.
- Return type:
float
- term_importances(importance_type='avg_weight')#
Provides the term importances.
- Parameters:
importance_type – The type of term importance requested (‘avg_weight’, ‘min_max’).
- Returns:
An array of term importances, with one importance per additive term.
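One plausible reading of the default ‘avg_weight’ importance is the mean absolute score of a term, weighted by the sample weight in each of its bins. This is a simplified sketch, not the library's implementation; consult the interpret source for the exact definition:

```python
def avg_weight_importance(term_scores, bin_weights):
    """Weighted mean of |score| over a term's bins (illustrative sketch only)."""
    total = sum(bin_weights)
    return sum(abs(s) * w for s, w in zip(term_scores, bin_weights)) / total

# A term with large scores in heavily populated bins is more important than
# one whose large scores sit in nearly empty bins.
dense = avg_weight_importance([0.5, -0.5], [100, 100])
sparse = avg_weight_importance([2.0, 0.0], [1, 199])
print(dense, sparse)
```

Weighting by bin population keeps a term from looking important just because it swings wildly in a region almost no training samples occupy.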
ExplainableBoostingRegressor#
- class interpret.glassbox.ExplainableBoostingRegressor(feature_names=None, feature_types=None, max_bins=256, max_interaction_bins=32, interactions=10, exclude=[], validation_size=0.15, outer_bags=8, inner_bags=0, learning_rate=0.01, greediness=0.0, smoothing_rounds=0, max_rounds=5000, early_stopping_rounds=50, early_stopping_tolerance=0.0001, min_samples_leaf=2, max_leaves=3, objective='rmse', n_jobs=-2, random_state=42)#
An Explainable Boosting Regressor
- Parameters:
feature_names (list of str, default=None) – List of feature names.
feature_types (list of FeatureType, default=None) –
List of feature types. FeatureType can be:
None: Auto-detect
’quantile’: Continuous with equal density bins
’rounded_quantile’: Continuous with quantile bins, but the cut values are rounded when possible
’uniform’: Continuous with equal width bins
’winsorized’: Continuous with equal width bins, but the leftmost and rightmost cut are chosen by quantiles
’continuous’: Use the default binning for continuous features, which is ‘quantile’ currently
[List of float]: Continuous with specified cut values. Eg: [5.5, 8.75]
[List of str]: Ordinal categorical where the order has meaning. Eg: [“low”, “medium”, “high”]
’ordinal’: Ordinal categorical where the order is determined by sorting the feature strings
’nominal’: Categorical where the order has no meaning. Eg: country names
max_bins (int, default=256) – Max number of bins per feature for the main effects stage.
max_interaction_bins (int, default=32) – Max number of bins per feature for interaction terms.
interactions (int, float, or list of tuples of feature indices, default=10) –
Interaction terms to be included in the model. Options are:
Integer (1 <= interactions): Count of interactions to be automatically selected
Percentage (interactions < 1.0): Determine the integer count of interactions by multiplying the number of features by this percentage
List of tuples: The tuples contain the indices of the features within the additive term
exclude ('mains' or list of tuples of feature indices|names, default=[]) – Features or terms to be excluded.
validation_size (int or float, default=0.15) –
Validation set size. Used for early stopping during boosting, and is needed to create outer bags.
Integer (1 <= validation_size): Count of samples to put in the validation sets
Percentage (validation_size < 1.0): Percentage of the data to put in the validation sets
0: Turns off early stopping. Outer bags have no utility. Error bounds will be eliminated
outer_bags (int, default=8) – Number of outer bags. Outer bags are used to generate error bounds and help with smoothing the graphs.
inner_bags (int, default=0) – Number of inner bags. 0 turns off inner bagging.
learning_rate (float, default=0.01) – Learning rate for boosting.
greediness (float, default=0.0) – Percentage of rounds where boosting is greedy instead of round-robin. Greedy rounds are intermixed with cyclic rounds.
smoothing_rounds (int, default=0) – Number of initial highly regularized rounds to set the basic shape of the main effect feature graphs.
max_rounds (int, default=5000) – Total number of boosting rounds with n_terms boosting steps per round.
early_stopping_rounds (int, default=50) – Number of rounds with no improvement to trigger early stopping. 0 turns off early stopping and boosting will occur for exactly max_rounds.
early_stopping_tolerance (float, default=1e-4) – Tolerance that dictates the smallest delta required to be considered an improvement.
min_samples_leaf (int, default=2) – Minimum number of samples allowed in the leaves.
max_leaves (int, default=3) – Maximum number of leaves allowed in each tree.
objective (str, default="rmse") – The objective to optimize. Options include: “rmse”, “gamma_deviance”, “poisson_deviance:max_delta_step=0.7”, “pseudo_huber:delta=1.0”, “rmse_log” (rmse with a log link function)
n_jobs (int, default=-2) – Number of jobs to run in parallel. Negative integers are interpreted as following joblib’s formula (n_cpus + 1 + n_jobs), just like scikit-learn. Eg: -2 means using all threads except 1.
random_state (int or None, default=42) – Random state. None uses device_random and generates non-repeatable sequences.
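The note that "rmse_log" is "rmse with a log link function" means the additive score is computed in log space and exponentiated at prediction time, making the model multiplicative in the target. A minimal sketch with invented numbers (not interpret's implementation):

```python
import math

def predict(term_contributions, intercept, link="identity"):
    """Sum additive term contributions, then invert the link function (sketch)."""
    score = intercept + sum(term_contributions)
    if link == "log":  # e.g. objective='rmse_log': prediction = exp(additive score)
        return math.exp(score)
    return score       # identity link: the additive score is the prediction

contribs = [0.2, -0.1]
print(predict(contribs, 1.0))         # identity link: ≈ 1.1
print(predict(contribs, 1.0, "log"))  # log link: ≈ exp(1.1)
```

A log link is a natural pairing with the gamma and Poisson deviance objectives, which model strictly positive or count-valued targets.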
- Variables:
n_features_in_ (int) – Number of features.
feature_names_in_ (List of str) – Resolved feature names. Names can come from feature_names, X, or be auto-generated.
feature_types_in_ (List of str) – Resolved feature types. Can be: ‘continuous’, ‘nominal’, or ‘ordinal’.
bins_ (List[Union[List[Dict[str, int]], List[array of float with shape (n_cuts,)]]]) – Per-feature list that defines how to bin each feature. Each feature in the list contains a list of binning resolutions. The first item in the binning resolution list is for binning main effect features. If there are more items in the binning resolution list, they define the binning for successive levels of resolutions. The item at index 1, if it exists, defines the binning for pairs. The last binning resolution defines the bins for all successive interaction levels. If the binning resolution list contains dictionaries, then the feature is either a ‘nominal’ or ‘ordinal’ categorical. If the binning resolution list contains arrays, then the feature is ‘continuous’ and the arrays will contain float cut points that separate continuous values into bins.
feature_bounds_ (array of float with shape (n_features, 2)) – Min/max bounds for each feature. feature_bounds_[feature_index, 0] is the min value of the feature and feature_bounds_[feature_index, 1] is the max value of the feature. Categoricals have min & max values of NaN.
histogram_edges_ (List of None or array of float with shape (n_hist_edges,)) – Per-feature list of the histogram edges. Categorical features contain None within the List at their feature index.
histogram_weights_ (List of array of float with shape (n_hist_bins,)) – Per-feature list of the total sample weights within each feature’s histogram bins.
unique_val_counts_ (array of int with shape (n_features,)) – Per-feature count of the number of unique feature values.
term_features_ (List of tuples of feature indices) – Additive terms used in the model and their component feature indices.
term_names_ (List of str) – List of term names.
bin_weights_ (List of array of float with shape (n_feature0_bins, ..., n_featureN_bins)) – Per-term list of the total sample weights in each term’s tensor bins.
bagged_scores_ (List of array of float with shape (n_outer_bags, n_feature0_bins, ..., n_featureN_bins)) – Per-term list of the bagged model scores.
term_scores_ (List of array of float with shape (n_feature0_bins, ..., n_featureN_bins)) – Per-term list of the model scores.
standard_deviations_ (List of array of float with shape (n_feature0_bins, ..., n_featureN_bins)) – Per-term list of the standard deviations of the bagged model scores.
bag_weights_ (array of float with shape (n_outer_bags,)) – Per-bag record of the total weight within each bag.
breakpoint_iteration_ (array of int with shape (n_stages, n_outer_bags)) – The number of boosting rounds performed within each stage until either early stopping, or the max_rounds was reached. Normally, the count of main effects boosting rounds will be in breakpoint_iteration_[0], and the count of interaction boosting rounds will be in breakpoint_iteration_[1].
intercept_ (float) – Intercept of the model.
min_target_ (float) – The minimum value found in ‘y’.
max_target_ (float) – The maximum value found in ‘y’.
- decision_function(X, init_score=None)#
Predict scores from model before calling the link function.
- Parameters:
X – Numpy array for samples.
init_score – Optional. Either a model that can generate scores, or per-sample initialization scores. If per-sample scores, they should be the same length as X.
- Returns:
The sum of the additive term contributions.
- explain_global(name=None)#
Provides global explanation for model.
- Parameters:
name – User-defined explanation name.
- Returns:
An explanation object, visualizing feature-value pairs as a horizontal bar chart.
- explain_local(X, y=None, name=None, init_score=None)#
Provides local explanations for provided samples.
- Parameters:
X – Numpy array for X to explain.
y – Numpy vector for y to explain.
name – User-defined explanation name.
init_score – Optional. Either a model that can generate scores, or per-sample initialization scores. If per-sample scores, they should be the same length as X.
- Returns:
An explanation object, visualizing feature-value pairs for each sample as horizontal bar charts.
- fit(X, y, sample_weight=None, init_score=None)#
Fits model to provided samples.
- Parameters:
X – Numpy array for training samples.
y – Numpy array as training labels.
sample_weight – Optional array of weights per sample. Should be same length as X and y.
init_score – Optional. Either a model that can generate scores, or per-sample initialization scores. If per-sample scores, they should be the same length as X.
- Returns:
Itself.
- monotonize(term, increasing='auto')#
Adjusts a term to be monotone using isotonic regression.
- Parameters:
term – Index or name of the continuous univariate term to apply monotone constraints to.
increasing – ‘auto’ or bool. ‘auto’ decides direction based on Spearman correlation estimate.
- Returns:
Itself.
- predict(X, init_score=None)#
Predicts on provided samples.
- Parameters:
X – Numpy array for samples.
init_score – Optional. Either a model that can generate scores, or per-sample initialization scores. If per-sample scores, they should be the same length as X.
- Returns:
Predicted value per sample.
- predict_and_contrib(X, init_score=None)#
Predicts on provided samples, returning predictions and explanations for each sample.
- Parameters:
X – Numpy array for samples.
init_score – Optional. Either a model that can generate scores, or per-sample initialization scores. If per-sample scores, they should be the same length as X.
- Returns:
Predictions and local explanations for each sample.
- score(X, y, sample_weight=None)#
Return the coefficient of determination of the prediction.
The coefficient of determination \(R^2\) is defined as \((1 - \frac{u}{v})\), where \(u\) is the residual sum of squares ((y_true - y_pred) ** 2).sum() and \(v\) is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0 and it can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get a \(R^2\) score of 0.0.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Test samples. For some estimators this may be a precomputed kernel matrix or a list of generic objects instead with shape (n_samples, n_samples_fitted), where n_samples_fitted is the number of samples used in the fitting for the estimator.
y (array-like of shape (n_samples,) or (n_samples, n_outputs)) – True values for X.
sample_weight (array-like of shape (n_samples,), default=None) – Sample weights.
- Returns:
score – \(R^2\) of self.predict(X) w.r.t. y.
- Return type:
float
Notes
The \(R^2\) score used when calling score on a regressor uses multioutput='uniform_average' from version 0.23 to keep consistent with the default value of r2_score(). This influences the score method of all the multioutput regressors (except for MultiOutputRegressor).
- term_importances(importance_type='avg_weight')#
Provides the term importances.
- Parameters:
importance_type – The type of term importance requested (‘avg_weight’, ‘min_max’).
- Returns:
An array of term importances, with one importance per additive term.